Abstract

A stabilizing Byzantine single-writer single-reader (SWSR) regular register, which stabilizes after 
the first invoked write operation, is first presented. Then, new/old ordering inversions are eliminated by 
the use of a (bounded) sequence number for writes, obtaining a practically stabilizing SWSR atomic 
register. A practically stabilizing Byzantine single-writer multi-reader (SWMR) atomic register is 
then obtained by using several copies of SWSR atomic registers. Finally, bounded time-stamps, with 
a time-stamp per writer, together with SWMR atomic registers, are used to construct a practically 
stabilizing Byzantine multi-writer multi-reader (MWMR) atomic register. In a system of n servers 
implementing an atomic register, and in addition to transient failures, the constructions tolerate t < n/8
Byzantine servers if communication is asynchronous, and t < n/3 Byzantine servers if it is
synchronous. The noteworthy feature of the proposed algorithms is that (to our knowledge) these 
are the first that build an atomic read/write storage on top of asynchronous servers prone to transient 
failures, and where up to t of them can be Byzantine. 


Keywords Asynchronous message-passing system, Atomic read/write register, Byzantine server, Clients/servers
architecture, Distributed algorithm, Fault-tolerance, Regular read/write register, Self-stabilization,
Transient failures.


1 Introduction 

Byzantine processes and self-stabilization Algorithms that tolerate Byzantine faults are of extreme interest,
as they can tolerate malicious takeovers of portions of the system, and still achieve the desired goal.
Moreover, as the program executed by several of the participants may include programming mistakes, it 
is possible that these participants will (unintentionally) behave in a malicious way. Obviously, when all 
participants exhibit Byzantine arbitrary behavior, the system output will be arbitrary too. 

Usually, upper bounds on the number of Byzantine participants are used as part of the algorithm
design assumptions. The cases in which such a bound is not respected are not considered, as the
system can reach an arbitrary configuration due to the possibly overwhelming malicious actions. Assume 
that some of the Byzantine participants regain consistency (possibly by rebooting, running anti-virus 
software, environment change) so that the assumed threshold on the number of Byzantine participants is 
now respected. Will the system regain consistency, from this arbitrary configuration? Or in other words 
will the system stabilize to a correct behavior? 

Related work and aim of the paper An active research area concerns the construction of a Byzantine- 
tolerant disk storage (e.g..ll2ll4l[l4l to cite a few). Many of these papers consider registers built on top of 
duplicated disks (servers), which are accessed by clients, and where disks and clients may exhibit different 
types of failures. The construction of a reliable shared memory on top of a Byzantine message-passing
system is addressed in [10].

Recently, several works investigated stabilizing Byzantine algorithms, e.g., [3] [El EU. The first of these
papers is the most related to our research, as it constructs a stabilizing Byzantine multi-writer multi-reader 
regular register, where t out of n servers (with n ≥ 5t + 1) can be Byzantine. Such a construction relies
on the write operation quiescence assumption, i.e., it is assumed that, after a burst of write operations
executed by the writer, there exists a sufficiently long period during which the writer does not invoke the write
operation. Differently, we construct a practically stabilizing Byzantine multi-writer multi-reader atomic 
register in a client/server system which is able to tolerate transient failures and up to t Byzantine servers. 
Given t, our solutions require n ≥ 8t + 1 servers when client/server communication is asynchronous,
and only n ≥ 3t + 1 servers when it is synchronous. This gap comes from the fact that, as they provide
bounds on message transfer delays, synchronous settings allow readers and writers to use timers. As far
as we know, our construction is the first that builds a distributed atomic read/write memory on top of
asynchronous servers, which communicate by message-passing, can suffer transient failures, and where 
some of them can exhibit a Byzantine behavior. 

Roadmap The paper is made up of 6 sections. The computing model and the problem which is
addressed are presented in Section 2. Then, Section 3 presents and proves correct an algorithm that builds a
stabilizing single-writer single-reader (SWSR) regular register. This algorithm is extended in Section 4 to
obtain an SWSR atomic register, and Section 5 shows how to go from “single-reader” to “multi-reader”
and from “single-writer” to “multi-writer”. Finally, Section 6 concludes the paper. Due to page limitation,
the synchronous communication case and proofs can be found in appendices.

2 Computing Model and the Problem we Want to Solve 

2.1 Computing model 

Basic system model The basic system model we consider consists of (n + 2) asynchronous sequential
processes. One of them is called “writer” (denoted p_w), another is called “reader” (denoted p_r), while the
n others are called “servers” (denoted s_1, ..., s_n).

From a communication point of view, there are 4n directed asynchronous communication links,
connecting each server to p_w and p_r (one in each direction). Each link is FIFO and reliable (neither loss,
corruption, duplication, nor creation of messages).

It is assumed that processing times are negligible, and they are consequently assumed to take zero time.
Only message transfers take time.

This basic model will be later enriched in two directions: one concerning client processes to have m 
reader/writer processes, and a second concerning the synchrony of the communication links. 

Failure model At most t < n/8 servers can commit Byzantine failures¹. Let us remember that a
server commits a Byzantine failure when it behaves arbitrarily [11]. Classical examples of Byzantine
behavior consist in sending erroneous values, not sending a message when this should be done, stopping
its execution, etc.

In addition to the possibility of Byzantine servers, the local variables of any process (writer, reader, 
servers) can suffer transient failures. This means that their values can be arbitrarily modified [0. It is
nevertheless assumed that there is a finite time τ_no_tr (which remains always unknown to the processes)
after which there are no more transient failures².

From a terminology point of view, a server is correct if it does not commit Byzantine failures. Hence, 
like the reader and the writer, any correct server can suffer transient failures.

¹ Actually, Byzantine failures can be “mobile” aana. This means that, if, after some time, a server that committed Byzantine
failures starts behaving correctly, a server that was previously behaving correctly can become Byzantine. This “failure mobility”
can occur at any time during the periods where there is no pending read or write operation, issued by p_w or p_r. In fact, in any
case, the system is guaranteed to converge to exhibit the desired behavior once the assumptions concerning the system hold again 
for a “long enough” period of time. 

² This assumption is required to ensure that, despite asynchrony and Byzantine behaviors, the problem we are interested in
can be solved. In fact, if the time between two successive transient faults is long enough, the system converges and produces 
useful outputs between transient failures. 
Configurations and executions Each process (writer, reader, or server) is a state machine, enriched with
the operations send and receive. Its state (called “local state”) is defined by the current values of its local 
variables. The state of a directed link consists of the messages that have been sent on this link, and are not 
yet received. 

A configuration (or global state) is composed of the local state of each process and the state of each 
link. Due to the “transient failures” behavioral assumption, the initial configuration can be arbitrary. 

Underlying ss-broadcast abstraction It is assumed that the system has a built-in communication 
abstraction, denoted ss-broadcast, that provides the reader and the writer with an operation denoted 
ss_broadcast(), and each server with a matching operation denoted ss_deliver(). When the reader or the 
writer (resp., server) uses this broadcast abstraction, we consequently say that it “ss-broadcasts” (resp., 
“ss-delivers”) a message. This communication abstraction is defined by the following properties. 

• Termination. If the reader or the writer invokes ss_broadcast(m), then this invocation terminates.

• Eventual delivery. If the reader or the writer invokes ss_broadcast(m) then every correct server 
eventually ss-delivers m. 

• Synchronized delivery. If a process p_x (reader or writer) invokes ss_broadcast(m) at time τ_b and
returns from this invocation at time τ_e, then there exists a set S of (n − 2t) correct servers, such
that, for each s_i ∈ S, there exists a time τ(i) such that τ_b < τ(i) < τ_e at which s_i executed
ss_deliver(m).

• No duplication. An invocation of ss_broadcast(m) by a process p (reader or writer) results in at 
most one ss_deliver(m) at any correct server.

• Validity. If a correct server s_i ss-delivers a message m from p (reader or writer), then either p
ss-broadcasts m, or m belongs to the initial state of the corresponding link. 

• Order delivery. Any correct server ss-delivers the messages ss-broadcast by a process p_x (reader or
writer) in the order in which they have been ss-broadcast. 

Implementations of such a broadcast abstraction are presented in Section 4.2 of @| (see also [3]).
They rely on bounded capacity communication links³.

2.2 Problem Statement 

Construction of a read/write register and assumptions The problem in which we are interested is the
construction of a stabilizing server-based atomic register REG, that can be written by the writer p_w, and
read by the reader p_r. From an abstraction point of view, the register provides the writer with an operation
write(v), where the input parameter v is the new value of the register, and the reader with an operation
read(), which returns the value of the register.

The construction is done incrementally. A regular register is first built. Then this construction is 
enriched to obtain an atomic register. Both constructions assume that (a) there is a time after which there
are no more transient failures (instant τ_no_tr), and (b) the writer invokes the write() operation at least once
after τ_no_tr. According to assumption (b), let τ_1w > τ_no_tr be the time at which the first write invoked after τ_no_tr
terminates.

Concurrent operations, read and write sequences Let W and R be the executions of a REG.write()
operation by the writer and a REG.read() operation by the reader, respectively. If W and R overlap in
time, they are said to be concurrent. If they do not overlap, they are said to be sequential.

Let us observe that, as the writer p_w (resp., reader p_r) is sequential, the set of invocations of the
operation write() (resp., read()) defines a sequence S_W (resp., S_R).

Stabilizing regular register A regular read/write register is defined by the following properties⁴.

³ Roughly speaking, in a simple implementation, when the operation that sends a message m is invoked by a correct process p_i to
a correct process p_j, p_i repeatedly sends the packet (0, m) to p_j until receiving (cap + 1) packets from p_j (where cap is the
maximal number of packets in transit from p_i to p_j and back). Then p_i repeatedly sends the packet (1, m) to p_j until receiving
(cap + 1) packets from p_j. Process p_j sends (bit, ack) only when receiving (bit, m), and executes ss_deliver(m) when receiving
the packet (1, m) immediately after receiving the packet (0, m).

⁴ These definitions of a stabilizing regular register, and a stabilizing atomic register, are straightforward extensions of the basic
definitions given in CD- 
• Liveness. Any invocation of REG.write() or REG.read() terminates.

• Eventual regularity. There is a finite time τ_stab > τ_1w after which each REG.read() operation R returns a
value v that was written by a REG.write() operation W that is (a) the last write operation executed
before R, or (b) a write operation concurrent with R.

Let us observe that, as there is at least one invocation of REG.write() (assumption), and any invocation
of REG.write() terminates (liveness), τ_1w exists. Let us also observe that, before τ_stab, read operations
can return arbitrary values. If a read/write register is regular, we say that the value returned by each of its
read operations is regular.
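
The regularity condition can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: operations are modeled as closed time intervals, the names `Op` and `allowed_values` are hypothetical, and the timestamps in the example are invented to mimic a read that overlaps a write.

```python
from typing import List, NamedTuple, Set


class Op(NamedTuple):
    """A write operation, modeled as a time interval plus the value written."""
    start: float
    end: float
    value: int


def allowed_values(writes: List[Op], read_start: float, read_end: float) -> Set[int]:
    """Values that a read over [read_start, read_end] may return under regularity."""
    allowed = set()
    # (a) the last write that terminated before the read started
    preceding = [w for w in writes if w.end < read_start]
    if preceding:
        allowed.add(max(preceding, key=lambda w: w.end).value)
    # (b) any write whose interval overlaps the read (concurrent write)
    allowed.update(
        w.value for w in writes if w.start <= read_end and w.end >= read_start
    )
    return allowed


# Invented timeline: write(0) over [0, 1], write(1) over [2, 5], write(2) over [7, 8].
writes = [Op(0, 1, 0), Op(2, 5, 1), Op(7, 8, 2)]
concurrent_read = allowed_values(writes, 3, 4)   # overlaps write(1)
late_read = allowed_values(writes, 9, 10)        # after every write
```

A read that overlaps write(1) may return either the previous value 0 or the concurrent value 1, which is exactly the freedom that regularity leaves open.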

The duration τ_stab − τ_no_tr is the time needed for the system to stabilize. After τ_stab, no invocation
of REG.read() returns an arbitrary value. But, while after τ_stab regularity prevents REG from returning
too “old” values, it still allows REG to return values in an order different from their writing order, as
described in Figure 1. The first read returns the value 1 (whose write is concurrent with it), while the
second read returns the value 0 (which was the last value written before it starts). This phenomenon is
known under the name “new/old inversion”.

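[Figure 1 timeline: the writer p_w executes REG.write(0), REG.write(1), and a third write in sequence; while REG.write(1) is in progress, the reader p_r executes a first read that returns 1, followed by a second read that returns 0.]

Figure 1: Regular register: new/old inversion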


Stabilizing atomic register Such a register is a stabilizing regular register that, after some time, does 
not allow new/old inversion. It is defined by the following properties. 

• Liveness. Any invocation of REG.write() or REG.read() terminates.

• Eventual atomicity. There is a finite time τ_stab > τ_1w after which any invocation of REG.read()
returns a regular value, and there are no two invocations of REG.read() that return new/old inverted
values.

Informally, this means that it is possible to merge sequences S_W and S_R to obtain a sequence S where,
after time τ_stab, each read operation returns the value written by the closest write operation that
precedes it.
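
As an illustration of what “no new/old inversion” rules out, the following Python sketch checks a sequence of read results against the write order. It assumes, for simplicity, that written values are pairwise distinct and that every value read was actually written; the function name `has_new_old_inversion` is hypothetical.

```python
def has_new_old_inversion(write_order, read_results):
    """Return True if a later read returns an older value than an earlier read.

    write_order: values in the order they were written (assumed distinct).
    read_results: values returned by successive reads, in read order.
    """
    rank = {value: index for index, value in enumerate(write_order)}
    newest_seen = -1
    for value in read_results:
        if rank[value] < newest_seen:
            return True          # this read went back to an older write
        newest_seen = max(newest_seen, rank[value])
    return False


# The scenario of Figure 1: the first read returns 1, the second returns 0.
inverted = has_new_old_inversion([0, 1, 2], [1, 0])
# An atomic behavior: successive reads never go backwards in the write order.
monotone = has_new_old_inversion([0, 1, 2], [0, 1, 1, 2])
```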

Notation and other read/write registers The previous registers are called stabilizing regular (or atomic) 
single-writer single-reader (SWSR) registers. The SWSR atomic register will be used in Section 5 as a
building block to construct stabilizing atomic single-writer multi-reader (SWMR) registers, and stabilizing 
atomic multi-writer multi-reader (MWMR) registers. 

3 Construction of a Stabilizing SWSR Regular Register 

This section presents a stabilizing algorithm that implements a single-writer single-reader regular register 
in the system model introduced in Section 2.1.

3.1 Algorithm 

The algorithms implementing the operations REG.write() and REG.read(), and the behavior of the servers
s_i, are described in Figure 2. The writer and the reader terminate their operations when they execute the
statement return() (line 06 for the writer, and lines 13 or 15 for the reader).

Local variables and update messages Each server s_i, 1 ≤ i ≤ n, manages two local variables, which
locally define its internal representation of the constructed regular register REG.

• The aim of the variable last_val_i is to store the last value written by the writer, as known by s_i. To
that end, when it invokes REG.write(v), the writer ss-broadcasts the message WRITE(v) to inform
the servers of the new value v.

• The aim of the variable helping_val_i is to contain the last value ss-broadcast by the writer to each
server s_i, when identifying that the reader requests assistance as write operations are too frequent.
This variable is reset to ⊥ at the beginning of every new read.
There is no specific local variable managed by the writer. As far as the reader is concerned, it has to
manage a single local variable.

• new_read_r is a Boolean flag that, when true, asks each server to reset its helping variable
helping_val_i to ⊥. To this end, the reader ss-broadcasts the message READ(new_read_r), where
new_read_r = true, each time it starts a new read operation.


operation write(v) is                % issued by the writer p_w %
(01) ss_broadcast WRITE(v) to all servers;
(02) wait (messages ACK_WRITE(helping_val) received from (n − t) different servers);
(03) if ¬(∃ w ≠ ⊥ such that helping_val = w for (4t + 1) of the previous messages)
(04)    then ss_broadcast NEW_HELP_VAL(v) to all servers
(05) end if;
(06) return().

operation read() is                  % issued by the reader p_r %
(07) new_read_r ← true;
(08) while (true) do
(09)    ss_broadcast READ(new_read_r) to all servers;
(10)    new_read_r ← false;
(11)    wait (messages ACK_READ(last_val, helping_val) received from (n − t) different servers);
(12)    if ((2t + 1) of the previous messages have the same last_val)
(13)       then let v be this value; return(v)    % the value returned is regular or atomic %
(14)       else if ((2t + 1) of the previous messages have the same helping_val ≠ ⊥)
(15)               then let w be this value; return(w)    % the value returned is atomic %
(16)            end if
(17)    end if
(18) end while.

when WRITE(v) is ss_delivered from p_w do
(19) last_val_i ← v;
(20) send ACK_WRITE(helping_val_i) to p_w.

when NEW_HELP_VAL(v) is ss_delivered from p_w do
(21) helping_val_i ← v.

when READ(new_read) is ss_delivered from p_r do
(22) if (new_read) then helping_val_i ← ⊥ end if;
(23) send ACK_READ(last_val_i, helping_val_i) to p_r.

Figure 2: Byzantine-tolerant stabilizing SWSR regular register 
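
The server side of Figure 2 is small enough to be sketched directly in Python. This is a minimal sketch under assumptions, not the authors' implementation: message passing is replaced by direct method calls, ⊥ is modeled by `None` (assumed never to be a written value), and the names `Server`, `on_write`, `on_new_help_val`, and `on_read` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Tuple

BOTTOM = None  # stands for the default value ⊥ (assumption: never written)


@dataclass
class Server:
    """Local state of one correct server s_i; the initial state may be arbitrary."""
    last_val: Any = BOTTOM
    helping_val: Any = BOTTOM

    def on_write(self, v: Any) -> Any:
        """WRITE(v) is ss-delivered from the writer (lines 19-20)."""
        self.last_val = v
        return self.helping_val              # payload of ACK_WRITE

    def on_new_help_val(self, v: Any) -> None:
        """NEW_HELP_VAL(v) is ss-delivered from the writer (line 21)."""
        self.helping_val = v

    def on_read(self, new_read: bool) -> Tuple[Any, Any]:
        """READ(new_read) is ss-delivered from the reader (lines 22-23)."""
        if new_read:
            self.helping_val = BOTTOM
        return (self.last_val, self.helping_val)   # payload of ACK_READ


# A server starting from a corrupted state is overwritten by the first write.
s = Server(last_val=5, helping_val=7)
ack_write = s.on_write(1)
ack_read = s.on_read(True)
```

Starting the example from an arbitrary state mirrors the transient-failure assumption: the first WRITE overwrites last_val, and the first READ(true) clears helping_val.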


Algorithm implementing REG.write() As already said, when the writer invokes REG.write(v), it first
ss-broadcasts the message WRITE(v) (line 01), and waits until it has received an acknowledgment message
ACK_WRITE(helping_val) from (n − t) servers (i.e., from at least (n − 2t) correct servers) (line 02).

When a server s_i ss-delivers the message WRITE(v), it updates last_val_i (line 19), and sends by return
(line 20) the acknowledgment ACK_WRITE(helping_val_i) to give the writer information on the state of
the reader (namely, helping_val_i = ⊥ means that the reader started a new read operation, and accordingly
helping_val_i needs to be refreshed).

When the writer stops waiting, it checks whether it has received the same value helping_val ≠ ⊥ from at
least (4t + 1) different servers (line 03). If this is not the case, the local variables helping_val_i of the
servers s_i need to be refreshed. To this end, the writer ss-broadcasts the message NEW_HELP_VAL(v) to
inform them that, from now on, they must consider v as the new helping value (lines 04 and 21).
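
The writer's test at line 03 can be sketched as a pure function over the received acknowledgments. The sketch below assumes that ⊥ is modeled by `None`, that helping values are hashable, and that `acks` holds the helping_val fields of the (n − t) ACK_WRITE messages; the name `needs_new_help_val` is hypothetical.

```python
from collections import Counter

BOTTOM = None  # stands for ⊥ (assumption)


def needs_new_help_val(acks, t):
    """Return True when the predicate of line 03 holds, i.e. when no value
    w != BOTTOM appears in at least 4t + 1 of the ACK_WRITE messages."""
    counts = Counter(a for a in acks if a is not BOTTOM)
    return not any(count >= 4 * t + 1 for count in counts.values())


# With t = 1 and n = 9, the writer waits for n - t = 8 acknowledgments.
stable_help = needs_new_help_val(["a"] * 5 + [BOTTOM] * 3, t=1)   # 5 >= 4t + 1
reset_by_reader = needs_new_help_val([BOTTOM] * 8, t=1)           # all servers reset
```

When the function returns True, the writer would go on to ss-broadcast NEW_HELP_VAL(v) (line 04).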

Algorithm implementing REG.read() When the reader invokes REG.read(), it sets new_read_r to
true (line 07) and enters a while loop (lines 08 and 18) that it will exit at line 13 or 15. Once in the loop
body, the reader starts a new inquiry by ss-broadcasting the message READ(new_read_r) to the servers.
If new_read_r = true, the message is related to a new read operation (line 07); if new_read_r = false,
it is from the same read operation as before (line 10). Then, the reader waits until it has received an
acknowledgment message ACK_READ(last_val, helping_val) from (n − t) servers (line 11).
When a server s_i receives the message READ(new_read_r), it resets helping_val_i to ⊥ if this message
indicates that a new read operation started (line 22). In all cases (i.e., whatever the value of new_read_r), it
sends by return its current local state in the message ACK_READ(last_val_i, helping_val_i) (line 23).

When the reader stops waiting, it returns the value v if the field last_val of (2t + 1) messages
ACK_READ() is equal to v (lines 12-13). Otherwise it returns the value w if the field helping_val of (2t + 1)
messages ACK_READ() is equal to w ≠ ⊥ (lines 14-15). If none of these predicates is satisfied, the reader
re-enters the loop body.
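
The reader's decision rule (lines 12-15) can likewise be sketched as a function over one batch of acknowledgments. This is a sketch under assumptions: ⊥ is modeled by `None`, values are hashable, and the name `reader_decision` is hypothetical. The pseudocode does not say which value wins if several reach the threshold; the sketch simply takes the first one encountered.

```python
from collections import Counter

BOTTOM = None  # stands for ⊥ (assumption)


def reader_decision(acks, t):
    """acks: list of (last_val, helping_val) pairs from (n - t) servers.

    Returns (True, value) if the read can return (line 13 or 15),
    or (False, None) if the reader must re-enter the loop body.
    """
    threshold = 2 * t + 1
    # Lines 12-13: enough servers agree on last_val.
    for value, count in Counter(last for last, _ in acks).items():
        if count >= threshold:
            return True, value
    # Lines 14-15: enough servers agree on a helping value different from ⊥.
    helping = Counter(h for _, h in acks if h is not BOTTOM)
    for value, count in helping.items():
        if count >= threshold:
            return True, value
    return False, None


# With t = 1 the threshold is 3 matching fields among the 8 acknowledgments.
by_last_val = reader_decision([("x", BOTTOM)] * 3 + [("y", BOTTOM)] * 2 + [("z", BOTTOM)] * 3, t=1)
by_helping = reader_decision([(i, "h" if i < 3 else BOTTOM) for i in range(8)], t=1)
retry = reader_decision([(i, BOTTOM) for i in range(8)], t=1)
```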

Remark on the reception order of the messages ACK_WRITE() and ACK_READ() It is important
to notice that, thanks to the properties of the ss-broadcast abstraction, and the fact that the links are
FIFO, we have the following. When the writer invokes ss_broadcast(), and later waits for associated
acknowledgments ACK_WRITE() from (n − t) servers (lines 01-02), the sequence of acknowledgments
received from each correct server matches the sequence of ss_broadcast() it issued (the same holds for
the reader and the acknowledgments ACK_READ(), lines 09-11). This means that ss_broadcast() and the
associated acknowledgments do not need to carry sequence numbers.

3.2 Proof of the construction 

All the proofs assume n ≥ 8t + 1.

Lemma 1. Any invocation of write() terminates.

Proof Due to the ss-broadcast termination property, the writer cannot block forever when it invokes
ss_broadcast() at line 01 or line 04. As far as the wait statement of line 02 is concerned, we have the
following: due to the ss-broadcast eventual delivery property, eventually at least (n − t) non-Byzantine servers
ss-deliver the message WRITE() ss-broadcast by the writer, and then they will eventually answer by
returning the acknowledgment message ACK_WRITE(), which concludes the proof of the lemma. □ Lemma 1

Lemma 2. Any invocation of read() terminates.

Proof Using the same reasoning as in Lemma 1, it follows that the reader cannot block forever in the
wait statement of line 11. So, the proof consists in showing that the predicate of line 12 or the one of line 14
becomes eventually true. The rest of the proof is by contradiction. Let R be the first invocation of read()
that does not terminate and let us consider an execution of the loop body after time τ_stab.

Claim C. At the time at which a write that started after τ_no_tr terminates, there are (a) at least
(n − 2t) correct servers s_i such that last_val_i = v, and (b) at least (3t + 1) correct servers s_j such that
helping_val_j = w ≠ ⊥.

Proof of the claim. Let us consider a write started after τ_no_tr and let τ_w be the time at which this write
terminates. Considering that after τ_no_tr there are no more transient failures, and due to the synchronized
delivery property of the ss-broadcast, we have that at time τ_w there are at least (n − 2t) correct servers s_i
such that last_val_i = v. Moreover, if the predicate of line 03 is true, it follows from (a) the synchronized
delivery property of the ss-broadcast of the message NEW_HELP_VAL() (line 04), and (b) the fact that
n − 2t ≥ 3t + 1, that at least (3t + 1) correct servers s_j are such that helping_val_j = w ≠ ⊥. If the predicate
of line 03 is false, there are (4t + 1) servers that sent ACK_WRITE(w) where w ≠ ⊥ (line 20), from which
we conclude that at least (3t + 1) of them are correct and are such that helping_val_j = w ≠ ⊥. End of
the proof of Claim C.

Let us consider the last write that terminated before R started, and let us assume it wrote x. Due
to part (a) of Claim C, just after this write terminated, at least (n − 2t) correct servers s_i are such that
last_val_i = x. If no write is concurrent with R, as R receives messages ACK_READ(last_val, −) from
(n − t) servers at line 11 (i.e., from at least (n − 2t) correct servers), it follows from the fact that the
intersection of any two sets Q1 and Q2 of (n − 2t) correct servers (the set Q1 of correct servers s_i such
that last_val_i = x, and the set Q2 of correct servers from which R receives ACK_READ(last_val, −))
contains at least (2t + 1) correct servers, that R terminates at lines 12-13.
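
The quorum-intersection step used above can be checked numerically. The sketch below relies on the standard inclusion-exclusion bound: two subsets of size q drawn from a pool of size p share at least 2q − p elements. The function name `min_overlap` is hypothetical, and the pool is taken either as the (n − t) correct servers when exactly t are Byzantine, or as all n servers when none is.

```python
def min_overlap(pool_size, subset_size):
    """Smallest possible intersection of two subsets of the given size
    drawn from a pool of the given size (inclusion-exclusion bound)."""
    return max(0, 2 * subset_size - pool_size)


# For n = 8t + 1, two sets of (n - 2t) correct servers always share
# at least (2t + 1) servers, whatever the number of actual Byzantine servers.
checks = []
for t in range(1, 6):
    n = 8 * t + 1
    quorum = n - 2 * t
    worst_pool = n          # no server is actually Byzantine
    usual_pool = n - t      # exactly t servers are Byzantine
    checks.append(min_overlap(worst_pool, quorum) >= 2 * t + 1)
    checks.append(min_overlap(usual_pool, quorum) >= 2 * t + 1)
```

With n = 9 and t = 1, for instance, two sets of 7 servers taken among 8 correct ones share at least 6 servers, well above the 2t + 1 = 3 needed at line 12.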

Let us now assume that there is exactly one write that is concurrent with R, and let y be the value
it writes. Due to the synchronized delivery property of ss-broadcast, R first resets to ⊥ the variables
helping_val_i of at least n − 2t ≥ 6t + 1 correct servers s_i (lines 07, 09, and 22), and then receives
(line 11) messages ACK_READ(last_val, −) from at least n − 2t ≥ 6t + 1 correct servers. We show that
at least (2t + 1) of these messages carry the same value, either x or y, from which R terminates at lines 12-13. Due to
part (a) of Claim C, there were at least n − 2t ≥ 6t + 1 correct servers s_i such that last_val_i = x when
the write of x finished. Let Q be this set of servers. R receives messages ACK_READ(last_val, −) from at
least (4t + 1) servers in Q. Due to the operation write(y) (concurrent with R), variables last_val_i of some
of these servers may have been updated to the value y. Hence, some of the previous (4t + 1) messages
ACK_READ(last_val, −) received by R carry x, while others carry y. Hence, at least (2t + 1) of them
carry the same value (either x or y), and R terminates at lines 12-13.

Let us finally consider the case where there is more than one write concurrent with R. When R
terminates its invocation of ss_broadcast READ(true) (there is only one such invocation per read operation,
line 09), the local variables helping_val_i of (n − 2t) correct servers are equal to ⊥. Let Q' be this set
of servers. (The proof of this statement is the same as the proof appearing in the first part of Claim C.)
Hence, when this ss-broadcast terminated, the message ACK_READ(−, helping_val_i) sent by each server
s_i ∈ Q' (line 23) is such that helping_val_i = ⊥. Let us consider the first write (e.g., write(z)) that occurs
after the servers s_i ∈ Q' have set helping_val_i to ⊥. This write receives (n − t) messages
ACK_WRITE(helping_val), and at least (4t + 1) of them are from servers in Q' and carry helping_val = ⊥. Hence the
predicate of line 03 is satisfied, and the writer issues ss_broadcast NEW_HELP_VAL(z). If later (i.e., after
the invocation of write(z) terminated), there are other invocations of write() concurrent with R, none of
them will execute line 04. This is due to the fact that R does not reset the variables helping_val_i to ⊥, and
the (n − t) messages ACK_WRITE(helping_val) sent by the servers at line 20 are such that at most t are
from Byzantine servers, and at least (4t + 1) carry z, from which it follows that there is a finite time τ_R after
which the variables helping_val_i of the correct servers are no longer modified. Let us finally consider the
first invocation of ss_broadcast READ(new_read_r) issued by R after τ_R, such that new_read_r = false. It
follows from the previous discussion that, among the (n − t) messages ACK_READ(−, helping_val) received
by R, at most t (the ones from Byzantine servers) carry arbitrary values, and at least (n − t) − 3t ≥ 4t + 1
carry the value z. When this occurs, R terminates at lines 14-15. □ Lemma 2

Lemma 3. Let t < n/8. There is a finite time τ_stab > τ_1w after which each read invocation R returns a
value v that was written by a write operation W, which is (a) the last write operation executed before R,
or (b) a write operation concurrent with R.

Proof Let us assume that a read operation R returns z, a value different from the value v of the last
completed write prior to R, and from any value u of a concurrent write. Let us consider the first write
concurrent with R. For R to return z, the reader must receive (2t + 1) messages ACK_READ(z, −) or
(2t + 1) messages ACK_READ(−, z). However, immediately following the termination of the write of v
there were (n − 2t) correct servers s_i with last_val_i = v. Thus, following the termination of the write of
v, and until the termination of the next write of some value u, the reader cannot receive (2t + 1) values
for a value z different from v and u. The above argument holds for the second concurrent write, where we
start with (n − 2t) values of u, and so on and so forth. □ Lemma 3

Theorem 1. Let t < n/8. The algorithm described in Figure 2 implements a stabilizing regular register
in the presence of at most t Byzantine servers. (The proof follows from Lemmas 1, 2, and 3.)

3.3 The case of synchronous links 

Let us consider a communication model where the links are synchronous. Synchronous means here that 
each link, connecting the reader or the writer and a correct server, is timely i.e., there is an upper bound on 
message transfer delays and this bound is known by the processes. When considering the construction of 
an SWSR regular register, this allows the reader or the writer to know how long it has to wait for a round 
trip delay with respect to the correct servers, and consequently use a timer with an appropriate timeout 
value. 

It appears that the previous algorithm can be adapted, with a very simple modification, to this
synchronous communication model to build a stabilizing SWSR regular register. Due to page limitation, this
algorithm is described and proved correct in Appendix A. The important result is the following theorem,
which states that, in such a synchrony setting, up to t < n/3 servers can commit Byzantine failures.
Theorem 2. Let t < n/3. The algorithm described in Figure 5 implements a stabilizing regular register
in the presence of at most t Byzantine servers. (Proof in Appendix A.)
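
The two resilience thresholds of Theorems 1 and 2 translate into simple integer arithmetic. The following Python sketch computes the largest tolerated t for a given number of servers n; the function name `max_byzantine` is hypothetical.

```python
def max_byzantine(n, synchronous):
    """Largest t such that t < n/8 (asynchronous links) or t < n/3 (synchronous links)."""
    divisor = 3 if synchronous else 8
    # t < n / divisor  is equivalent to  divisor * t <= n - 1
    return (n - 1) // divisor


async_t = max_byzantine(9, synchronous=False)   # n = 8t + 1 with t = 1
sync_t = max_byzantine(4, synchronous=True)     # n = 3t + 1 with t = 1
```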

4 Construction of a Stabilizing SWSR Practically Atomic Register 

Practically stabilizing SWSR atomic register A stabilizing SWSR practically atomic register is a 
stabilizing SWSR regular register with no new/old inversions as long as the number of writes between 
two successive reads (that are not executed concurrently with any write) is less than a given constant 
called system-life-span (e.g., 2^64) CD.

This section presents a practically stabilizing SWSR atomic register that stabilizes after a read that 
(a) is not concurrent with a write, and (b) follows the first write that follows the last transient failure. Its 
operations are denoted prac_at_write() and prac_at_read(). 

Algorithm The stabilizing SWSR practically atomic register algorithm is described in Figure 3. It is an
extension of the algorithm implementing a stabilizing regular register presented in Figure 2. The lines with
the same number xy are exactly the same in both algorithms. A line numbered Nx is a new line, while a
line numbered xyMz corresponds to a modification of the line xy of Figure 2.

Underlying principle To obtain an algorithm implementing such a register, the main idea is to count
the invocations of prac_at_write() so that no new/old inversion can occur if the reader tracks the sequence
number attached to each written value, and replaces an older value with a newer one that is already known.
This is the role of the write sequence number denoted wsn. Hence, the data value v appearing in Figure 2
is now replaced by the pair (wsn, v) in Figure 3. Therefore, last_val_i now contains such a pair, and
helping_val_i now contains either such a pair, or the default value ⊥.

Special care must be taken to bound wsn so that there is no ambiguity on its current value. Hence, a 
relation ≥_cd on sequence numbers has to be defined, such that it always reflects the write order of the values 
they are associated with. This relation is defined as follows: given two integers x and y (e.g., in the range 
[0, 2^128 + 1]), x ≥_cd y iff the clockwise distance (hence the subscript cd) from y to x is smaller than their 
anti-clockwise distance; moreover, x >_cd y if x ≥_cd y and x ≠ y. This precedence relation is used at lines N6 
and 13M2 to compare the highest previously received sequence number pwsn with the current one and 
to update it (lines N6, 13M2, and 15M). As transient failures may corrupt counter values, those must be 
automatically corrected. This is done as follows. After the first read that follows a write invocation and 
does not overlap a write, the local pair (pwsn, pv) stored by the reader reflects the last correct value 
read. Thus, the bookkeeping of pwsn and pv, together with the values of wsn and v that are read, reflects 
the right value ordering, which allows their correct reordering, thereby providing the writer and the reader 
with an atomic register. 
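
The clockwise-distance comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (cw_distance, geq_cd, gt_cd) are hypothetical, the ring size is the modulus used at line N1 of Figure 3 (2^64 + 1), and two equal values are treated as being at clockwise distance 0, so that x ≥_cd x holds.

```python
MOD = 2**64 + 1  # assumed ring size: the modulus of line N1 in Figure 3


def cw_distance(y: int, x: int, mod: int = MOD) -> int:
    """Clockwise distance from y to x on a ring with `mod` positions."""
    return (x - y) % mod


def geq_cd(x: int, y: int, mod: int = MOD) -> bool:
    """x >=_cd y: the clockwise way from y to x is shorter than the other way."""
    cw = cw_distance(y, x, mod)
    return cw < mod - cw  # mod - cw is the anti-clockwise distance


def gt_cd(x: int, y: int, mod: int = MOD) -> bool:
    """x >_cd y: x >=_cd y and x != y."""
    return x != y and geq_cd(x, y, mod)
```

Because the assumed ring size is odd, the two distances between distinct values are never equal, so exactly one of geq_cd(x, y) and geq_cd(y, x) holds when x ≠ y. In particular, a counter that has just wrapped around (e.g., 0) still compares as newer than MOD - 1.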

The aim of lines N2-N7 is to do a sanity check for the local pair (pwsn, pv) managed by the 
reader. To that end, the reader ss-broadcasts the message READ (false), and waits for (n - t) associated 
acknowledgments ACK_READ (-, helping_val) (lines N2-N3). If (2t + 1) of these messages carry the 
same pair helping_val = (wsn, v), and wsn is smaller than pwsn, then the reader adopts this pair as the 
current value of (pwsn, pv). This is because, if (2t + 1) of these messages carry the same pair, they reflect 
the last value written, and therefore carry the correct wsn. Hence the “if” statement at line N6, whose 
aim is to refresh the pair (pwsn, pv). This preliminary sanity check, which relies on values provided by 
the servers, helps the rest of the read algorithm (lines 07-18, which are nearly the same as the ones of 
Figure 2) prevent new/old inversions from occurring. 
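
The sanity check of lines N4-N6 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names (quorum_value, sanity_check) are hypothetical, None stands for the default value ⊥, each helping value is modeled as a (wsn, v) tuple, and when several values reach the threshold the sketch simply picks the most frequent one.

```python
from collections import Counter

MOD = 2**64 + 1  # assumed ring size: the modulus of line N1 in Figure 3


def gt_cd(x, y, mod=MOD):
    """x >_cd y: x != y and the clockwise distance from y to x is the shorter one."""
    d = (x - y) % mod
    return x != y and d < mod - d


def quorum_value(values, threshold):
    """Return a non-None value reported at least `threshold` times, else None."""
    counts = Counter(v for v in values if v is not None)
    for value, count in counts.most_common():
        if count >= threshold:
            return value
    return None


def sanity_check(pwsn, pv, helping_vals, t):
    """Lines N4-N6: refresh (pwsn, pv) from a (2t + 1)-quorum of helping values."""
    agreed = quorum_value(helping_vals, 2 * t + 1)
    if agreed is not None:
        wsn, v = agreed
        if gt_cd(pwsn, wsn):  # the local counter is ahead of what servers hold
            return wsn, v
    return pwsn, pv
```

For example, with t = 1 and three servers reporting the helping pair (3, 'a'), a reader whose corrupted local pair is (7, 'stale') falls back to (3, 'a'), while a reader holding (2, 'old') keeps its pair.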

Remark Due to page limitation, the proof of the previous construction is given in Appendix B. Let us 
notice that the “synchronous link” algorithm, designed for n ≥ 3t + 1 processes, has a similar extension, 
which builds an SWSR atomic register version. 








operation prac_at_write (v) is   % issued by the writer p_w % 
(N1)    wsn ← (wsn + 1) mod (2^64 + 1); 
(01M)   ss_broadcast WRITE (wsn, v) to all servers; 
(02)    wait (messages ACK_WRITE (helping_val) received from (n - t) different servers); 
(03)    if ¬(∃ w ≠ ⊥ such that helping_val = w for (4t + 1) of the previous messages) 
(04M)      then ss_broadcast NEW_HELP_VAL (wsn, v) to all servers 
(05)    end if; 
(06)    return (). 

operation prac_at_read () is   % issued by the reader p_r % 
(N2)    ss_broadcast READ (false) to all servers; 
(N3)    wait (messages ACK_READ (last_val, helping_val) received from (n - t) different servers); 
(N4)    if ((2t + 1) of the previous messages have the same helping_val ≠ ⊥) 
(N5)       then let (wsn, v) be this value; 
(N6)            if (pwsn >_cd wsn) then pwsn ← wsn; pv ← v end if   % sanity check for pwsn and pv % 
(N7)    end if; 
(07)    new_read_r ← true; 
(08)    while (true) do 
(09)       ss_broadcast READ (new_read_r) to all servers; 
(10)       new_read_r ← false; 
(11)       wait (messages ACK_READ (last_val, helping_val) received from (n - t) different servers); 
(12)       if ((2t + 1) of the previous messages have the same last_val) 
(13M1)        then let (wsn, v) be this value; 
(13M2)             if (wsn ≥_cd pwsn) then pwsn ← wsn; pv ← v; return (v) 
(13M3)             else return (pv)   % prevention of new/old inversion % 
(13M4)             end if 
(14)          else if ((2t + 1) of the previous messages have the same helping_val ≠ ⊥) 
(15M)              then let (wsn, w) be this value; pwsn ← wsn; pv ← w; return (w)   % already atomic % 
(16)               end if 
(17)       end if 
(18)    end while. 

when WRITE (sn, v) is ss_delivered from p_w do   % v is now a pair (seq. nb, value) % 
(19)    last_val_i ← v; 
(20)    send ACK_WRITE (helping_val_i) to p_w. 

when NEW_HELP_VAL (v) is ss_delivered from p_w do   % v is now a pair (seq. nb, value) % 
(21)    helping_val_i ← v. 

when READ (new_read) is ss_delivered from p_r do 
(22)    if (new_read) then helping_val_i ← ⊥ end if; 
(23)    send ACK_READ (last_val_i, helping_val_i) to p_r. 

Figure 3: Byzantine-tolerant practically stabilizing SWSR atomic register 
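
The server side of Figure 3 (lines 19-23) is a small state machine, which can be sketched as below. This is an illustration only: the class and method names are hypothetical, None stands for ⊥, messages are modeled as return values instead of being sent over ss-broadcast and FIFO links, and the initial state (which a transient failure could make arbitrary) is arbitrarily chosen as ⊥.

```python
class Server:
    """State of one correct server s_i; None plays the role of the default value."""

    def __init__(self):
        self.last_val = None      # last (wsn, v) pair written
        self.helping_val = None   # pair used to help a slow reader

    def on_write(self, pair):
        """Handler for WRITE: store the pair, acknowledge with the helping value."""
        self.last_val = pair                        # line 19
        return ("ACK_WRITE", self.helping_val)      # line 20

    def on_new_help_val(self, pair):
        """Handler for NEW_HELP_VAL: overwrite the helping value."""
        self.helping_val = pair                     # line 21

    def on_read(self, new_read):
        """Handler for READ: the first READ of a read operation resets the help."""
        if new_read:                                # line 22
            self.helping_val = None
        return ("ACK_READ", self.last_val, self.helping_val)  # line 23
```

A reader then only has to count identical fields among the acknowledgments it collects, as done at lines 12 and 14.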


5 Construction of Stabilizing SWMR and MWMR Atomic Registers 

5.1 Construction of a Stabilizing SWMR Atomic Register 

The technique to obtain a SWMR atomic register from SWSR atomic registers is a classical one fl3l 
021. The writer interacts with each reader, writing the same value to all readers, the servers maintaining 
variables for each reader. Since the result is an atomic register for each reader, and every write is applied to 
all of them, the result is a single-writer multi-reader register. Let swmr_write() and swmr_read() denote the 
operations associated with such a SWMR atomic register. 
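
The structure of this construction can be sketched as follows. The sketch only shows how one SWSR copy per reader is combined; it does not model servers, Byzantine behavior, or concurrency. The class SWSRRegister is a hypothetical in-memory stand-in for the register of Figure 3, and the reader identifiers are illustrative.

```python
class SWSRRegister:
    """Hypothetical in-memory stand-in for one SWSR atomic register."""

    def __init__(self, initial=None):
        self._value = initial

    def prac_at_write(self, v):
        self._value = v

    def prac_at_read(self):
        return self._value


class SWMRRegister:
    """One SWSR copy per reader; the single writer updates every copy."""

    def __init__(self, reader_ids, initial=None):
        self._copies = {r: SWSRRegister(initial) for r in reader_ids}

    def swmr_write(self, v):
        # The writer writes the same value to the copy of each reader.
        for copy in self._copies.values():
            copy.prac_at_write(v)

    def swmr_read(self, reader_id):
        # Each reader only ever reads its own copy.
        return self._copies[reader_id].prac_at_read()
```

For instance, after SWMRRegister(['r1', 'r2']).swmr_write(42), both swmr_read('r1') and swmr_read('r2') return 42.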

5.2 Construction of a Stabilizing MWMR Atomic Register 

This section presents a stabilizing algorithm that implements a multi-writer multi-reader atomic register 
in the system model introduced in Section [2711 

Underlying SWMR atomic registers It is assumed that each process is both a reader and a writer. 
Hence, in the following we use the term “process”. Let m be the number of processes. A process is 
denoted p_i, 1 ≤ i ≤ m. The construction uses one stabilizing SWMR register per process. Let REG[i] 
be the SWMR register associated with p_i, which means that any process can read it but only p_i can write 
it. 

To write REG[i], p_i invokes REG[i].swmr_write(v, epoch, seq), where epoch is a bounded label (see 
below), and seq is a sequence number bounded by some large constant 2^64. Any process p_j reads REG[i] 
by invoking REG[i].swmr_read(). Such an invocation returns a triple (v, epoch, seq), where v is a data 
value, whose associated timestamp is the pair (epoch, seq). 


operation mwmr_write (v) is   % issued by process p_i % 
(01) for j ∈ {1, ..., m} do reg_i[j] ← REG[j].swmr_read() end for;   % obtains m triples (val, epoch, seq) % 
(02) if ((¬ max_epoch(reg_i[1..m])) ∨ (∃ j : [(reg_i[j].epoch = max_epoch(reg_i[1..m])) ∧ (reg_i[j].seq ≥ 2^64)])) 
(03)    then reg_i[i] ← (v, next_epoch(reg_i[1..m]), 0) 
(04) end if; 
(05) let M be the set of indexes j such that max_epoch(reg_i[1..m]) = reg_i[j].epoch; 
(06) seq_max ← max(reg_i[j].seq, j ∈ M); 
(07) REG[i].swmr_write(v, max_epoch(reg_i[1..m]), seq_max + 1); 
(08) return (). 

operation mwmr_read () is   % issued by process p_i % 
(09) for j ∈ {1, ..., m} do reg_i[j] ← REG[j].swmr_read() end for;   % obtains m triples (val, epoch, seq) % 
(10) if ((¬ max_epoch(reg_i[1..m])) ∨ (∃ j : [(reg_i[j].epoch = max_epoch(reg_i[1..m])) ∧ (reg_i[j].seq ≥ 2^64)])) 
(11)    then reg_i[i] ← (reg_i[i].v, next_epoch(reg_i[1..m]), 0); REG[i].swmr_write(reg_i[i].v, reg_i[i].epoch, 0) 
(12) end if; 
(13) let M be the set of indexes j such that max_epoch(reg_i[1..m]) = reg_i[j].epoch; 
(14) seq_max ← max(reg_i[j].seq, j ∈ M); 
(15) let min ∈ M be the minimal index such that reg_i[min].seq = seq_max; 
(16) return (reg_i[min].v). 

Figure 4: Byzantine-tolerant stabilizing MWMR atomic register from SWMR registers 


The notion of an epoch This notion was introduced in [1], where a bounded labeling scheme is proposed 
with uninitialized values. Let k > 1 be an integer, and let K = k^2 + 1. We consider the set X = 
{1, 2, ..., K} and let L (the set of epochs) be the set of all ordered pairs (s, A) where s ∈ X and A ⊆ X 
has size k. 

The comparison operator ≻ among two epochs is defined as follows: 

(s_i, A_i) ≻ (s_j, A_j) ≝ (s_j ∈ A_i) ∧ (s_i ∉ A_j). 

Note that this operator is antisymmetric by definition, yet may not be defined for every pair (s_i, A_i) and 
(s_j, A_j) in L (e.g., s_j ∈ A_i and s_i ∈ A_j). 

Given a subset S of epochs of L, a function is defined in [1] which computes a new epoch that is 
greater (with respect to ≻) than every label in S. This function, called next_epoch(), is as follows. Given 
a subset of k epochs (s_1, A_1), (s_2, A_2), ..., (s_k, A_k), next_epoch((s_1, A_1), (s_2, A_2), ..., (s_k, A_k)) is the 
epoch (s, A) that satisfies: 

- s is an element of X that is not in the union A_1 ∪ A_2 ∪ ... ∪ A_k (as the size of each A_j is k, the 
size of the union is at most k^2, and since X is of size k^2 + 1 such an s always exists). 

- A is a subset of size k of X containing all values (s_1, s_2, ..., s_k) (if they are not pairwise distinct, 
add arbitrary elements of X to get a set of size exactly k). 

The relation ≻ is extended to ⪰ as follows: 

(s_i, A_i) ⪰ (s_j, A_j) ≝ ((s_i, A_i) ≻ (s_j, A_j)) ∨ ((s_i = s_j) ∧ (A_i = A_j)). 

The predicate max_epoch() applied to a set of epochs returns true if there is an epoch in the set 
that is equal to or greater (in the sense of the relation ⪰) than any other epoch in the set. 
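
The epoch definitions above can be made concrete with a short sketch. It is written under stated assumptions: an epoch is modeled as a tuple (s, A) with A a frozenset, all function names are hypothetical, and max_epoch here returns the greatest epoch itself (or None when none exists) so that it can serve both as the predicate described above and as the value used in Figure 4.

```python
def greater(a, b):
    """(s_i, A_i) > (s_j, A_j): s_j is in A_i and s_i is not in A_j."""
    (si, Ai), (sj, Aj) = a, b
    return sj in Ai and si not in Aj


def geq(a, b):
    """Extension of the order with equality of epochs."""
    return a == b or greater(a, b)


def next_epoch(epochs, k):
    """Build an epoch greater than each of the (at most k) given epochs."""
    assert len(epochs) <= k
    universe = range(1, k * k + 2)            # X = {1, ..., k^2 + 1}
    used = set()
    for _, antistings in epochs:
        used |= antistings                    # union of the A_j, size <= k^2
    s = next(x for x in universe if x not in used)
    stings = {sting for sting, _ in epochs}   # all the s_j
    # Pad with arbitrary elements of X so that A has size exactly k.
    padding = [x for x in universe if x not in stings][: k - len(stings)]
    return (s, frozenset(stings) | frozenset(padding))


def max_epoch(epochs):
    """Return an epoch that is >= every epoch of the list, or None."""
    for e in epochs:
        if all(geq(e, other) for other in epochs):
            return e
    return None
```

With k = 2, the epochs (1, {2, 3}) and (2, {3, 4}) are ordered (the first is greater), whereas (1, {2, 3}) and (2, {1, 3}) are incomparable, in which case max_epoch returns None and a new epoch has to be started.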

Algorithm implementing mwmr_write() When a process p_i invokes mwmr_write(v), it first checks 
if it has to start a new epoch (lines 01-04). To that end, it first reads all the underlying SWMR registers 
REG[1..m], and saves their values in its local array reg_i[1..m] (line 01). This constitutes its view of the 
global state. Hence, for any j ∈ {1, ..., m}, reg_i[j] contains a triple (v, epoch, seq), namely, reg_i[j].v is 
the data value of REG[j], reg_i[j].epoch is the epoch of the timestamp of v, and reg_i[j].seq is the sequence 
number of the timestamp of v. 

Then, if there is no greatest epoch in reg_i[1..m], or there is one (reg_i[j].epoch) but the associated 
sequence number (reg_i[j].seq) is equal to or greater than the bound 2^64, p_i must start the next epoch 
(ne = next_epoch(reg_i[1..m])), which starts with the sequence number 0, and inform the other processes. 
To this end, p_i writes the value v and its timestamp (ne, 0) in reg_i[i]. 

Then p_i writes the value v with its epoch and sequence number (line 07). The pair (epoch, sequence 
number) is computed at lines 05-07 so that it is greater than all the previous pairs known by p_i. 
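
The timestamp computation of lines 02-07 can be sketched as follows. This is an illustration under stated assumptions: the function write_timestamp and the Triple type are hypothetical, the functions max_epoch and next_epoch are passed as parameters, and in the usage example plain integers stand in for the bounded epochs (so that max and "max + 1" play their roles).

```python
from typing import Any, NamedTuple

SEQ_BOUND = 2**64  # bound on sequence numbers used in Figure 4


class Triple(NamedTuple):
    v: Any
    epoch: Any
    seq: int


def write_timestamp(reg, i, v, max_epoch, next_epoch, seq_bound=SEQ_BOUND):
    """Return the triple that p_i passes to REG[i].swmr_write at line 07."""
    reg = list(reg)                                   # local view reg_i[1..m]
    top = max_epoch([r.epoch for r in reg])           # None if no greatest epoch
    overflow = any(r.epoch == top and r.seq >= seq_bound for r in reg)
    if top is None or overflow:                       # line 02
        ne = next_epoch([r.epoch for r in reg])
        reg[i] = Triple(v, ne, 0)                     # line 03
        top = max_epoch([r.epoch for r in reg])
    seq_max = max(r.seq for r in reg if r.epoch == top)   # lines 05-06
    return Triple(v, top, seq_max + 1)                # line 07
```

With integer stand-ins, a view [('a', 1, 4), ('b', 1, 7), ('c', 0, 99)] yields the timestamp (1, 8) for the new value, and a view whose greatest epoch has reached the sequence bound yields a fresh epoch with sequence number 1.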

Algorithm implementing mwmr_read() The algorithm implementing the operation mwmr_read() is 
nearly the same as the one implementing the operation mwmr_write(). Lines 09-12 are the same as 
lines 01-04, except line 11, where p_i writes a new epoch into the timestamp of reg_i[i]. 

The second difference is at lines 14-16, where the value returned by the read operation is computed. 
This value is the one associated with the greatest epoch known by p_i and the greatest sequence number, 
where process identities are used to break ties (if needed). 
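
The selection performed at lines 13-16 can be sketched as below. The function name select_read_value is hypothetical, triples are modeled as (v, epoch, seq) tuples, indexes start at 0 instead of 1, and the greatest epoch is assumed to have been computed beforehand (integers are used as stand-in epochs in the example).

```python
def select_read_value(reg, top_epoch):
    """Value with the greatest epoch, then greatest seq, then smallest index."""
    # Line 13: indexes whose epoch is the greatest one.
    M = [j for j, (_, epoch, _) in enumerate(reg) if epoch == top_epoch]
    # Line 14: greatest sequence number within that epoch.
    seq_max = max(reg[j][2] for j in M)
    # Line 15: tie-breaking on process identities (smallest index wins).
    winner = min(j for j in M if reg[j][2] == seq_max)
    # Line 16: the value returned by the read.
    return reg[winner][0]
```

For the view [('a', 1, 5), ('b', 2, 9), ('c', 2, 9), ('d', 2, 3)] and greatest epoch 2, the entries of indexes 1 and 2 tie on the sequence number 9, and the value 'b' of the smaller index is returned.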

Proof Due to page limitation, the proof of the previous construction is given in Appendix C. 

6 Conclusion 

This paper was on the implementation of stabilizing server-based storage on top of an asynchronous 
message-passing system where up to t servers can exhibit a Byzantine behavior. A first basic algorithm 
was presented, which implements a single-writer single-reader regular register stabilizing after the first 
write invocation. This algorithm tolerates t < n/8 if communication is asynchronous, and t < n/3 if it 
is synchronous. This algorithm was then extended to obtain a practically stabilizing atomic single-writer 
single-reader register. Finally, the paper presented a generalization allowing any number of processes to 
read and write the practically stabilizing atomic register. 

This paper, together with [5], is one of the very first to address the construction of a read/write register 
in an asynchronous system where all servers can experience transient failures, and some of them can 
behave arbitrarily. While the algorithms presented in [5] require the “operation quiescence” assumption 
and build only regular registers, our constructions are (as already noticed in the introduction) the first that 
build a distributed atomic read/write memory on top of asynchronous servers, which communicate by 
message-passing with the reader and writer processes, can suffer transient failures, and where some of 
them can exhibit a Byzantine behavior. 

A SWSR Regular Register in a Synchronous Communication Setting 


This section presents and proves correct an algorithm that builds a stabilizing SWSR regular register 
in a synchronous system where up to t < n/3 servers can commit Byzantine failures. 

As already indicated in Section 3.3, synchronous means here that there is an upper bound on message 
transfer delays on each link connecting a process (reader or writer) and a correct server. Moreover, this 
bound is known by the processes. Hence, both the reader and the writer know how long they have to wait 
for messages from all correct servers, and can consequently use timers with appropriate timeout values. 

The corresponding algorithm is described in Figure 5, which is a simple adaptation of the basic algorithm 
of Figure 2. The modified lines are suffixed with the letter M. 

Due to the link synchrony property, we have the following. When the writer writes a value x to the 
correct servers (which are at least (2t + 1)), and then starts another write of a value y, as it obtains values 
from all correct servers, a concurrent read obtains at least (t + 1) messages carrying x, or at least (t + 1) 
messages carrying y. More generally, if the writer is faster than the reader, it assists the reader to find 
(2t + 1) identical non-⊥ values, writing the same value at all correct servers. The reader can then read at 
least (t + 1) identical non-⊥ values in the helping_val field of the messages it receives from all correct 
servers, and is able to return a correct value. 


operation write (v) is   % issued by the writer p_w % 
(01)    ss_broadcast WRITE (v) to all servers; 
(02M)   wait (messages ACK_WRITE (helping_val) received from n different servers or time-out); 
(03M)   if ¬(∃ w ≠ ⊥ such that helping_val = w for (t + 1) of the previous messages) 
(04)       then ss_broadcast NEW_HELP_VAL (v) to all servers 
(05)    end if; 
(06)    return (). 

operation read () is   % issued by the reader p_r % 
(07)    new_read_r ← true; 
(08)    while (true) do 
(09)       ss_broadcast READ (new_read_r) to all servers; 
(10)       new_read_r ← false; 
(11M)      wait (messages ACK_READ (last_val, helping_val) received from n different servers or time-out); 
(12M)      if ((t + 1) of the previous messages have the same last_val) 
(13)          then let v be this value; return (v)   % the value returned is regular or atomic % 
(14M)         else if ((t + 1) of the previous messages have the same helping_val ≠ ⊥) 
(15)               then let w be this value; return (w)   % the value returned is atomic % 
(16)               end if 
(17)       end if 
(18)    end while. 

when WRITE (v) is ss_delivered from p_w do 
(19)    last_val_i ← v; 
(20)    send ACK_WRITE (helping_val_i) to p_w. 

when NEW_HELP_VAL (v) is ss_delivered from p_w do 
(21)    helping_val_i ← v. 

when READ (new_read) is ss_delivered from p_r do 
(22)    if (new_read) then helping_val_i ← ⊥ end if; 
(23)    send ACK_READ (last_val_i, helping_val_i) to p_r. 


Figure 5: Byzantine-tolerant stabilizing SWSR regular register (semi-synchronous links and t < n/3) 

The proof is a straightforward adaptation of the proof of Section 3.2, which takes into account the 
synchrony assumption. It assumes t < n/3. 

Lemma 4. Any invocation of write () terminates. 

Proof Due to the ss-broadcast termination property, the writer cannot block forever when it invokes 
ss_broadcast() at line 01 or line 04. As far as the wait statement of line 02 is concerned, we have the 
following. Due to the ss-broadcast eventual delivery property, at least (n - t) non-Byzantine servers 
ss-deliver the message WRITE() ss-broadcast by the writer, and send back the acknowledgment 
message ACK_WRITE(), which concludes the proof of the lemma. □ Lemma 4 

Lemma 5. Any invocation of read () terminates. 

Proof Using the same reasoning as in Lemma 4, it follows that the reader cannot block forever in the 
wait statement of line 11. So, the proof consists in showing that the predicate of line 12 or the one of line 14 
eventually becomes true. The rest of the proof is by contradiction. R being the first invocation of read () 
that does not terminate, let us consider an execution of the loop body after time τ_stab. 

Claim C. At the time at which a write that started after τ_no_tr terminates, there are (a) at least (n - t ≥ 
2t + 1) correct servers s_i such that last_val_i = v, and (b) at least (t + 1) correct servers s_j such that 
helping_val_j = w ≠ ⊥. 

Proof of the claim. It follows from the synchronized delivery property of the ss-broadcast of the message 
WRITE(), and the fact that no correct server suffers transient failures after τ_no_tr, that, when a write 
that started after τ_no_tr terminates, there are at least (n - t) correct servers s_i such that last_val_i = v. 
Moreover, if the predicate of line 03 is true, it follows from (a) the synchronized delivery property of the 
ss-broadcast of the message NEW_HELP_VAL() (line 04), and (b) the fact that n - t > t + 1, that at least 
(t + 1) correct servers s_j are such that helping_val_j = w ≠ ⊥. If the predicate of line 03 is false, there are 
(2t + 1) servers that sent ACK_WRITE(w) where w ≠ ⊥ (line 20), from which we conclude that there 
are at least (t + 1) with helping_val_j = w ≠ ⊥. End of the proof of the claim C. 

Let us consider the last write that terminated before R started, and let us assume it wrote x. Due 
to part (a) of Claim C, just after this write terminated, all the (n - t) correct servers s_i are such that 
last_val_i = x. If no write is concurrent with R, as R receives messages ACK_READ (last_val, -) from 
(n - t) correct servers at line 11, it follows that R terminates at lines 12-13. 

Let us now assume that there is exactly one write that is concurrent with R, and let y be the value 
it writes. Due to the synchronized delivery property of ss-broadcast, R first resets to ⊥ the variables 
helping_val_i of all n - t correct servers s_i (lines 07, 09, and 22), and then receives (line 11) messages 
ACK_READ(last_val, -) from all the correct servers. We show that at least (t + 1) of these messages 
carry the same value (either x or y), from which R terminates at lines 12-13. Due to part (a) of Claim C, there were at least 
n - t ≥ (2t + 1) correct servers s_i such that last_val_i = x when the write of x finished. Let Q be this 
set of servers. R receives messages ACK_READ (last_val, -) from all the (2t + 1) correct servers in Q. 
Due to the operation write(y) (concurrent with R), the variables last_val_i of some of these servers may have 
been updated to the value y. Hence, some of the previous (2t + 1) messages ACK_READ(last_val, -) 
received by R carry x, while others carry y. Hence, at least (t + 1) of them carry x, or at least (t + 1) of them 
carry y, and R terminates at lines 12-13. 
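
The counting step used in this paragraph is a pigeonhole argument: if (2t + 1) messages each carry one of two values, the more frequent value appears at least (t + 1) times. The following brute-force check illustrates it; the function name is hypothetical and the check only covers the counting, not the protocol.

```python
def min_largest_share(total: int) -> int:
    """Smallest possible count of the more frequent of two values
    when `total` messages each carry one of the two values."""
    # k messages carry x and (total - k) carry y, for every possible k.
    return min(max(k, total - k) for k in range(total + 1))


# For every t, splitting 2t + 1 messages between x and y always leaves
# at least t + 1 messages carrying the same value.
for t in range(50):
    assert min_largest_share(2 * t + 1) == t + 1
```

For t = 1, for instance, three messages split as 2 + 1 at best, so two of them (t + 1) agree, which is what the predicate of line 12M requires.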

Let us finally consider the case where there is more than one write concurrent with R. When R 
terminates its invocation of ss_broadcast READ(true) (there is only one such invocation per read, line 09), 
the local variables helping_val_i of (n - t) correct servers are equal to ⊥. Let Q' be this set of servers. (The 
proof of this statement is the same as the proof appearing in the first part of Claim C.) Hence, when this 
ss-broadcast terminates, the message ACK_READ(-, helping_val_i) sent by each server s_i ∈ Q' (line 23) 
is such that helping_val_i = ⊥. Let us consider the first write (e.g., write(z)) that occurs after the servers 
s_i ∈ Q' have set helping_val_i to ⊥. This write receives (n - t) messages ACK_WRITE (helping_val), 
and at least (2t + 1) of them are from servers in Q', and consequently carry helping_val = ⊥. Hence 
the predicate of line 03 is satisfied, and the writer issues ss_broadcast NEW_HELP_VAL(z). If later (i.e., 
after the invocation of write(z) terminated) there are other invocations of write() concurrent with R, none 
of them will execute line 04. This is due to the fact that R no longer resets the variables helping_val_i to ⊥, 
and the (n - t) messages ACK_WRITE (helping_val) sent by the servers at line 20 are such that at most t 
are from Byzantine servers, and at least (2t + 1) carry z, from which it follows that there is a finite time τ_R 
after which the variables helping_val_i of the correct servers are no longer modified. Let us finally consider 
the first invocation of ss_broadcast READ(new_read_r) issued by R after τ_R, such that new_read_r = false. 
It follows from the previous discussion that, among the (n - t) messages ACK_READ (-, helping_val) 
received by R, at most t (the ones from Byzantine servers) carry arbitrary values, and at least (n - t) carry 
the value z. When this occurs, R terminates at lines 14-15. □ Lemma 5 








Lemma 6. Let t < n/3. There is a finite time τ_stab > τ_1w after which each read invocation R returns a 
value v that was written by a write operation W, which is (a) the last write operation executed before R, 
or (b) a write operation concurrent with R. 

Proof Let us assume that a read operation R returns z, a value different from the value v of the last 
completed write prior to R, and from any value u of a concurrent write. Let us consider the first write 
concurrent with R. For R to return z, the reader must receive (t + 1) messages ACK_READ(z, -) or (t + 1) 
messages ACK_READ(-, z). However, immediately following the termination of the write of v there were 
(n - t) correct servers s_i with last_val_i = v. Thus, following the termination of the write of v, and until 
the termination of the next write of some value u, the reader cannot receive (t + 1) messages carrying a value z 
different from v and u. The above argument holds for the second concurrent write, where we start with 
(n - t) values of u, and so on and so forth. □ Lemma 6 

Theorem 2. Let t < n/3. The algorithm described in Figure 5 implements a stabilizing regular register 
in the presence of at most t Byzantine servers. 

Proof The proof follows from Lemma 4, Lemma 5, and Lemma 6. □ Theorem 2 


B Proof of the Stabilizing SWSR Atomic Register (Section 4) 

Lemma 7. Any invocation of a prac_at_write() operation terminates. 

Proof Let us suppose by contradiction that there exists a prac_at_write() operation op_w invoked by the 
writer p_w and that op_w does not terminate. If such an operation does not terminate, it means that p_w never 
executes line 06 in Figure 3. Let us note that, due to the ss-broadcast termination property, p_w cannot 
be blocked while sending messages. Thus, the only point where p_w can be blocked is while executing line 02 
in Figure 3, waiting for the delivery of ACK_WRITE() messages. An ACK_WRITE() message is sent 
by a server when it delivers a WRITE(sn, v) message (line 20, Figure 3), which is in turn sent by p_w at the 
beginning of the prac_at_write() operation (line 01M, Figure 3). Due to the eventual delivery property of 
ss-broadcast, eventually n - t correct servers will deliver the WRITE() message sent by p_w 
and will send back an ACK_WRITE() message. Thus, considering that the links connecting each server to the 
writer are FIFO and reliable, p_w will eventually deliver at least n - t ACK_WRITE() messages. 
Therefore, we have a contradiction and the claim follows. □ Lemma 7 

Lemma 8. Let op_w be a prac_at_write(v) operation invoked by the writer p_w at some time t_S(op_w) > 
τ_no_tr, let wts be the sequence number associated with the operation, and let t_E(op_w) be the time at which 
op_w terminates. At time t_E(op_w) there exist at least (n - 2t) correct servers that store locally in their 
last_val_i variable the pair (v, wts). 

Proof Due to Lemma 7, time t_E(op_w) exists. Let us now show that at that time, at least 
(n - 2t) correct servers store the pair (v, wts). The writer p_w returns from the prac_at_write(v) operation 
only after it is unblocked from the wait statement in line 02. If p_w is unblocked, it means that it delivered 
at least (n - t) ACK_WRITE() messages from n - t different servers. An ACK_WRITE() message is sent 
by a server s_i when it delivers a WRITE(v) message and just after it updated its local copy of the register 
with the value and the sequence number contained in the WRITE(v) message (line 19, Figure 3). Let us 
denote such a time by τ_update. Considering that (i) both ss-broadcast and the FIFO link involved in such 
a message pattern do not create messages, (ii) the value and the sequence number are communicated to 
s_i directly from the writer, (iii) among the (n - t) messages ACK_WRITE() received by p_w, at most t are 
from Byzantine servers, and (iv) τ_update < t_E(op_w), the claim follows. □ Lemma 8 

Lemma 9. Let op_w be a prac_at_write(v) operation invoked by the writer p_w at some time t_S(op_w) > 
τ_no_tr, let wts be the sequence number associated with op_w, and let t_E(op_w) be the time at which op_w 
terminates. At time t_E(op_w) there exist at least (4t + 1) correct servers that store locally in their helping_val_i 
variable the same pair (v', ts). 




Proof Due to Lemma 7, time t_E(op_w) exists. Let us now show that at that time, at least 
(n - 2t) correct servers store the same pair (v', ts). The writer p_w returns from the prac_at_write(v) 
operation only after it is unblocked from the wait statement in line 02. If p_w is unblocked, it means that 
it delivered at least (n - t) ACK_WRITE(helping_val) messages from (n - t) different servers. Thus, p_w received 
at least (n - t) helping values, stored locally at the servers, from (n - t) different servers. Let t_del be the 
time at which p_w is unblocked from the wait statement in line 02 and evaluates the condition in line 03. 
Two cases can happen: the condition at line 03 is (i) true, or (ii) false. 

• Case 1: The condition in line 03 is true. In this case, among the (n - t) received 
helping values, there does not exist a value w ≠ ⊥ occurring at least (4t + 1) times. This means that the helping 
values stored at each server s_i during the current prac_at_write() operation are corrupted values and 
need to be cleaned. Thus, at time t_del, the writer p_w broadcasts a NEW_HELP_VAL (wts, v) message 
that will trigger the update of the helping_val_i variable (line 21). Considering that ss-broadcast (i) 
does not modify the content of messages, (ii) guarantees that at least (n - 2t) correct servers deliver 
the message before the end of its invocation, and (iii) p_w returns from the prac_at_write(v) operation 
only after the termination of the ss-broadcast, it follows that at least (n - 2t) correct servers stored 
the same pair (v, wts) in their helping_val_i local variable before the end of the operation. As 
n > 8t, the claim follows. 

• Case 2: The condition in line 03 is false. In this case, the claim directly follows as the writer 
found (4t + 1) identical values. □ Lemma 9 
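
The last step of Case 1 relies on simple arithmetic: with n > 8t, i.e., n ≥ 8t + 1, the (n - 2t) correct servers that stored the pair number at least 6t + 1, which is at least the (4t + 1) required by the lemma. The sketch below merely checks this inequality numerically; the function name is hypothetical.

```python
def identical_helping_copies(n: int, t: int) -> int:
    """Lower bound (n - 2t) on the correct servers holding the same helping
    pair at the end of the write, under the assumption n > 8t."""
    if not n > 8 * t:
        raise ValueError("the asynchronous construction assumes t < n/8")
    return n - 2 * t


# The bound of Case 1 always reaches the 4t + 1 copies claimed by Lemma 9.
for t in range(100):
    smallest_n = 8 * t + 1
    assert identical_helping_copies(smallest_n, t) == 6 * t + 1
    assert identical_helping_copies(smallest_n, t) >= 4 * t + 1
```

For example, with t = 1 and n = 9, at least 7 correct servers store the pair, while the lemma only needs 5.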

Lemma 10. Any invocation of a prac_at_read() operation terminates. 

Proof Let us suppose by contradiction that there exists a prac_at_read() operation op_r invoked by the 
reader p_r and that op_r does not terminate. If such an operation does not terminate, it means that p_r never 
executes line 13M2, line 13M3, or line 15M in Figure 3. Let us note that, due to the ss-broadcast termination 
property, p_r cannot be blocked while sending messages. Thus, the only points where p_r can be 
blocked are (i) while executing line 11 in Figure 3, waiting forever for the delivery of ACK_READ() messages, 
or (ii) cycling forever because the set of ACK_READ() messages received by the reader never contains a value 
x that is the last value reported by at least (2t + 1) servers, nor a value y that is the helping value reported 
by at least (2t + 1) servers. 

Case 1: The reader remains blocked while executing line 11 in Figure 3. If the reader is blocked while 
executing line 11 in Figure 3, it means that it never delivers at least (n - t) ACK_READ() messages from 
servers. An ACK_READ() message is sent by a server when it delivers a READ() message (line 23, Figure 3), 
which is in turn sent by p_r at the beginning of the read() operation (line 09, Figure 3). Due to the 
eventual delivery property of ss-broadcast, eventually (n - t) correct servers will deliver 
the READ() message sent by p_r and will eventually send back an ACK_READ() message. Thus, considering 
that the links connecting each server to the reader are FIFO and reliable, p_r will eventually deliver 
at least (n - t) ACK_READ() messages. Therefore, we have a contradiction and this case can never happen. 

Case 2: The reader never collects (2t + 1) copies of the same last value or it never collects (2t + 1) copies 
of the same helping value. Let us note that last values and helping values are sent by a server s_i 
through an ACK_READ() message when it delivers a READ() message (line 23, Figure 3). 

Thus, if the reader is not able to find (2t + 1) identical last values or (2t + 1) identical helping values, it means 
that there always exist (n - t) servers answering with different values. Note that each server s_i updates its 
last_val_i variable when delivering a WRITE() message sent by the writer, and it updates its helping_val_i 
variable either during a write, using values provided by the writer, or during a read, resetting such value 
to ⊥. Considering that, by assumption, there exists a prac_at_write() operation issued after time τ_no_tr, 
we have that, due to Lemma 8 and Lemma 9, there exists a time τ > τ_no_tr at which the write terminates 
and such that at least (n - 2t) correct servers store the same last value and at least (4t + 1) 
correct servers store the same helping value. Let us show now that the prac_at_read() operation op_r 
eventually terminates after time τ. Let us consider the first READ() message m broadcast by p_r after time 
τ. Two further cases may happen: (2.1) m is the first message sent by p_r in the while loop (i.e., m is a 
READ(true) message (line 09)), or (2.2) m is the a-th message sent by p_r in the while loop, with a > 1 
(i.e., m is a READ(false) message (line 09)). 

• Case 2.1. If m is a READ(true) message, it will trigger the update of the helping_val_i variable 
to ⊥ at any correct server s_i that will deliver it. Due to the Synchronized Delivery property of the 
ss-broadcast primitive, at least (n - 2t) correct servers will update their helping_val_i 
variable. Considering that, at time τ, we have (n - 2t) correct servers storing the same last value and 
considering that we have only one reader p_r, it follows that such values can be modified concurrently 
with the broadcast only by the writer. So, if the writer is not going to modify such values, servers 
will answer to the broadcast by sending back the last values stored at time τ and the helping values just 
updated. Considering that messages are not altered by the network, the reader will receive at least 
n - 3t identical last values and at least n - 3t identical helping values. Thus, evaluating the condition in 
line 12, the reader will find it true and it will terminate the operation by executing either line 13M2 or 
line 13M3. 

Conversely, if the writer is going to update the last_val_i variables due to a concurrent write, the 
reader will find the condition in line 12 false as well as the condition in line 14. Note that such a 
concurrent write will be acknowledged by servers with at least n - 3t helping values equal to ⊥. 
This will entail the update of the helping_val_i variables with the value concurrently written. As 
a consequence, in the next iteration of the while loop, due to Lemma 9, there will exist at least 
(4t + 1) correct servers with the same helping value different from ⊥. Thus, at least (2t + 1) 
will acknowledge the next READ() message, making the condition in line 14 true and letting the 
operation terminate. 

• Case 2.2. If m is a READ(false) message, it will just be acknowledged by servers with the current 
values stored locally in their last_val_i variable and in their helping_val_i variable. Considering that, 
at time τ, we have (n - 2t) correct servers storing the same last value and at least (4t + 1) 
correct servers storing the same helping value, and considering that we have only one reader p_r, 
it follows that such values can be modified concurrently with the broadcast only by the writer. 
Depending on the value stored by the (4t + 1) correct servers (i.e., ⊥ or a different one), we fall 
back to the previous case or the reader will find the condition in line 14 immediately 
true. In both cases the operation terminates, and the claim follows. 

Lemma 11. Let τ_no_tr be the time after which no more transient failures happen. Let op_w1 be the first 
prac_at_write() operation issued after τ_no_tr and let τ_1w > τ_no_tr be the time at which op_w1 terminates. 
Let S_W be the sequence of prac_at_write() operations issued by p_w and let S_W|op_w1 be the sub-sequence 
of S_W starting with op_w1. For each op_wi ∈ S_W|op_w1, let sn_i be the sequence number associated with the 
operation. For each pair op_wi, op_w(i+1) of adjacent writes in S_W|op_w1, we have that sn_(i+1) >_cd sn_i. 

Proof The claim follows directly from the definition of the precedence relation $>_{cd}$, considering that after 
time $\tau_{no\_tr}$ sequence numbers are generated only by the unique writer, by incrementing the previous one. 

□ Lemma 11 
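
The ordering of adjacent writes can be illustrated with a minimal sketch. It assumes, purely for illustration, that sequence numbers are 64-bit values and that the precedence relation compares them by clockwise distance on the ring of size 2^64; the names `cd_greater`, `next_sn` and `adjacent_writes_ordered` are hypothetical and do not come from the algorithm's figures.

```python
# Illustrative assumption: sequence numbers live on a ring of size 2**64 and
# "x follows y" means x is reached from y in fewer than 2**63 increments.
MODULUS = 2 ** 64
HALF = 2 ** 63


def cd_greater(x: int, y: int) -> bool:
    """Return True when x follows y by fewer than HALF increments (mod MODULUS)."""
    return 0 < (x - y) % MODULUS < HALF


def next_sn(sn: int) -> int:
    """Sequence number the writer would use for its next write (hypothetical helper)."""
    return (sn + 1) % MODULUS


def adjacent_writes_ordered(start: int, count: int) -> bool:
    """Check that each of `count` successive increments follows its predecessor."""
    sn = start
    for _ in range(count):
        nxt = next_sn(sn)
        if not cd_greater(nxt, sn):
            return False
        sn = nxt
    return True
```

Under this assumption the property also holds across the wrap-around point, for example when starting a few increments below 2^64.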


Lemma 12. Let $t < n/8$. There is a finite time $\tau_{stab} > \tau_{1w}$ after which each prac_at_read() operation $op_r$ 
returns a value $v$ that was written by a write operation $op_w$, which is (a) the last prac_at_write() operation 
executed before $op_r$, or (b) a prac_at_write() operation concurrent with $op_r$. 

Proof Due to Lemma 10, each prac_at_read() operation eventually terminates. Let us show 
that there exists a time $\tau_{stab}$ after which each prac_at_read() operation terminates returning 
a valid value (i.e., the last value written or a value concurrently written). Without loss of generality, let 
us consider only prac_at_read() operations starting after time $\tau_{1w}$ (i.e., only prac_at_read() 
operations following the end of the first completed write in the stability period). 

Let $op_w$ be the first prac_at_write($v$) operation terminated after $\tau_{stab}$, let $x$ be the sequence number 
associated with this operation, and let $\tau_{1w}$ be the time at which it terminates. Let us consider a prac_at_read() operation $op_r$ 
issued at some time after $\tau_{1w}$. When executing $op_r$, the reader $p_r$ sends a READ() message to all servers, 
which answer by sending back their pair ($last\_val_i$, $helping\_val_i$) (line 23, Figure 3). Note that, due 
to Lemma 8, at time $\tau_{1w}$ there exist at least $(n - 2t)$ correct servers storing the same pair $(v, x)$ in their 
$last\_val_i$ variable and, due to Lemma 9, at time $\tau_{1w}$ there exist at least $(4t + 1)$ correct servers storing 
the same pair $(v', x')$ in their $helping\_val_i$ local variable. 

If there is no concurrent prac_at_write() operation, servers answer by sending back 
the pair $(v, x)$ and the pair $(v', x')$. In order to select a value to return, the reader waits for $(n - t)$ 
messages. Since $t$ answers may arrive from Byzantine servers and $t$ may arrive from servers that are not 
yet updated, only $n - 3t$ values are guaranteed to arrive from correct and updated servers. 
Considering that $n > 8t$, at least $5t + 1$ messages arrive from correct and updated servers. 
Thus, evaluating the condition in line 12, Figure 3, $p_r$ finds it true and checks whether $x$ is smaller 
or greater than its current local sequence number. Two cases may happen: (1) $pwsn >_{cd} x$ or (2) $x >_{cd} pwsn$. 

• Case 1: $pwsn >_{cd} x$. In this case the reader executes line 13.M3, returning the locally stored value, 
which may be corrupted. Let us remark that, since $op_r$ is the first read executed after the 
stabilization, it may happen that, executing lines N2-N7, $p_r$ collects helping values that are still corrupted 
and sets its local sequence number to a corrupted value. However, this happens only 
once since, from this time on, the only process that generates sequence numbers for write operations is 
the writer. Considering that such sequence numbers are generated by incrementing the 
previous one each time (see Lemma 11), such a scenario may happen only a finite number of times. Thus, 
eventually the writer uses a sequence number that is greater than or equal to the current one, and 
eventually a read returns a valid value. 

• Case 2: $x >_{cd} pwsn$. In this case the reader executes line 13.M2, returning the last written value, and 
the claim follows. 

Let us note that, due to the enforcement of the helping value by the writer, in case of concurrent 
writes we obtain the scenario described so far, and the claim follows. □ Lemma 12 
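
The counting step used in the proof above can be checked numerically. The sketch below only restates the arithmetic of the text (the reader waits for n - t replies, of which up to t may be Byzantine and up to t may come from servers not yet updated); the function names are hypothetical.

```python
def guaranteed_correct_updated(n: int, t: int) -> int:
    """Replies guaranteed to come from correct, updated servers among n - t replies."""
    replies = n - t          # the reader waits for n - t messages
    return replies - t - t   # minus t Byzantine answers and t not-yet-updated servers


def bound_holds(max_t: int) -> bool:
    """Check n - 3t >= 5t + 1 for the smallest n allowed by n > 8t, for t = 1..max_t."""
    for t in range(1, max_t + 1):
        n = 8 * t + 1  # smallest integer n with n > 8t
        if guaranteed_correct_updated(n, t) < 5 * t + 1:
            return False
    return True
```

For instance, with t = 1 and n = 9 the reader is guaranteed 6 = 5t + 1 such replies, whereas n = 8 would guarantee only 5.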

Lemma 13. Let $t < n/8$. There is a finite time $\tau_{stab} > \tau_{1w}$ after which any prac_at_read() having fewer 
than $2^{63} + 1$ concurrent prac_at_write() operations returns a regular value and no two invocations of 
prac_at_read() return new/old inverted values. 

Proof Eventual validity follows from Lemma 12. Thus, in the following, we just need to prove that there 
exists a time $\tau_{stab} > \tau_{1w}$ after which no new/old inversion happens. Let us suppose by contradiction that 
there exist two prac_at_read() operations $op_r^1$ and $op_r^2$ such that $op_r^1$ happens before $op_r^2$, $op_r^1$ 
returns a value $v_1$, $op_r^2$ returns a value $v_2$, and prac_at_write($v_2$) happens before prac_at_write($v_1$). If 
prac_at_write($v_2$) happens before prac_at_write($v_1$), it means that $sn_1 >_{cd} sn_2$. Note that, if $op_r^1$ returned 
value $v_1$, it means that $p_r$ executed line 13.M3 or line 13.M2. However, in both cases, before returning $v_1$, 
$p_r$ updated its current local sequence number to $sn_1$. Thus, executing $op_r^2$ and evaluating the condition in 
line 13.M2, $p_r$ finds it false and executes line 13.M3, returning $v_1$, and we have a contradiction. 

Note that the local sequence number can be reset to a value smaller than $sn_1$ only if, executing line 
N6, $p_r$ finds the condition true. However, this happens if and only if the writer's sequence number wrapped 
around because there are more than $2^{63} + 1$ concurrent operations, and the claim follows. □ Lemma 13 
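
The role of the reader's local sequence number in ruling out new/old inversions can be sketched as follows. This is a simplified model under stated assumptions, not the algorithm of Figure 3: sequence numbers are assumed to be 64-bit values compared by clockwise distance, the class `ReaderState` and its field `pr_val` are hypothetical, and the wrap-around reset discussed above is deliberately not modelled.

```python
# Illustrative assumption: 64-bit sequence numbers compared by clockwise distance.
MODULUS = 2 ** 64
HALF = 2 ** 63


def cd_greater(x: int, y: int) -> bool:
    """Return True when x follows y by fewer than HALF increments (mod MODULUS)."""
    return 0 < (x - y) % MODULUS < HALF


class ReaderState:
    """Local state of the single reader: last accepted sequence number and value."""

    def __init__(self, pwsn: int = 0, pr_val=None):
        self.pwsn = pwsn      # current local sequence number
        self.pr_val = pr_val  # value returned by the last accepted read (hypothetical name)

    def select(self, v, x: int):
        """Return v if its sequence number x is newer than pwsn, else the stored value."""
        if cd_greater(x, self.pwsn):
            self.pwsn = x
            self.pr_val = v
            return v
        return self.pr_val
```

In this model, once a read has returned the value tagged with the newer sequence number, a later read that is offered an older pair keeps returning the newer value.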

Theorem 3. Let $t < n/8$. The algorithm described in Figure 3 implements a Byzantine-tolerant practically 
stabilizing SWSR atomic register. 

Proof The proof follows from Lemmas 7-13. □ Theorem 3 

The synchronous link version has an analogous proof for $t < n/3$. 


C Proof of the Stabilizing MWMR Atomic Register (Section 5) 

The proofs of the next two lemmas are straightforward, as the code of mwmr_write() and mwmr_read() is 
sequential. 

Lemma 14. Any invocation of mwmr_write() terminates. 

Lemma 15. Any invocation of mwmr_read() terminates. 

Definition 1 (Total order relation $\succeq_{to}$). Let $W_i$, timestamped $(epoch_i, seq_i)$, be any write issued by any 
process $p_i$ and $W_j$, timestamped $(epoch_j, seq_j)$, be any write issued by any process $p_j$. $W_j \succ_{to} W_i$ iff 
$(epoch_j \succ epoch_i) \lor ((epoch_j = epoch_i) \land (seq_j > seq_i)) \lor ((epoch_j = epoch_i) \land (seq_j = seq_i) \land (j > i))$. 
Moreover, $W_j \succeq_{to} W_i \equiv ((W_j \succ_{to} W_i) \lor (W_i = W_j))$. 
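
As a minimal sketch of this definition, the comparison can be written as a lexicographic test on (epoch, sequence number, writer index). The sketch assumes, for illustration only, that epochs are plain integers compared with `>`; the bounded epoch labels of the construction have their own comparison, which is not modelled here. The names `Stamp`, `succ_to`, `succeq_to` and `is_total_order` are hypothetical.

```python
from itertools import product
from typing import NamedTuple, Sequence


class Stamp(NamedTuple):
    epoch: int   # assumption: integer epochs stand in for the bounded labels
    seq: int
    writer: int  # index of the writing process


def succ_to(wj: Stamp, wi: Stamp) -> bool:
    """Strict relation: epoch first, then sequence number, then writer index."""
    if wj.epoch != wi.epoch:
        return wj.epoch > wi.epoch
    if wj.seq != wi.seq:
        return wj.seq > wi.seq
    return wj.writer > wi.writer


def succeq_to(wj: Stamp, wi: Stamp) -> bool:
    """Non-strict relation: strictly greater or equal."""
    return succ_to(wj, wi) or wj == wi


def is_total_order(stamps: Sequence[Stamp]) -> bool:
    """Brute-force check of reflexivity, antisymmetry, transitivity, comparability."""
    reflexive = all(succeq_to(a, a) for a in stamps)
    antisymmetric = all(a == b for a, b in product(stamps, repeat=2)
                        if succeq_to(a, b) and succeq_to(b, a))
    transitive = all(succeq_to(a, c) for a, b, c in product(stamps, repeat=3)
                     if succeq_to(a, b) and succeq_to(b, c))
    comparable = all(succeq_to(a, b) or succeq_to(b, a)
                     for a, b in product(stamps, repeat=2))
    return reflexive and antisymmetric and transitive and comparable
```

With integer epochs the relation is a lexicographic order, so the brute-force check succeeds on any finite set of distinct timestamps.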

Lemma 16 (Total order on writes). There is a finite time $\tau$ that follows either a non-concurrent write or 
a non-concurrent read, such that all write operations invoked after $\tau$ are totally ordered. 

Proof First notice that a non-concurrent write or read enforces the existence of a greatest epoch that all 
subsequent reads and writes identify. Let $S$ be the set of writes on the register that happened after $\tau$. We 
prove in the following that $\succeq_{to}$ is a total order on $S$: 

• $\succeq_{to}$ reflexivity: $W_i \succeq_{to} W_i$ follows directly from the definition. 

• $\succeq_{to}$ antisymmetry: $(W_i \succeq_{to} W_j) \land (W_j \succeq_{to} W_i)$ implies $W_i = W_j$. Since $W_i$ and $W_j$ happen after 
$\tau$, it follows that the above relations reduce to $(i \geq j) \land (j \geq i)$. Hence $i = j$ and $W_i = W_j$. 

• $\succeq_{to}$ transitivity: $(W_i \succeq_{to} W_j) \land (W_j \succeq_{to} W_k)$ implies $(W_i \succeq_{to} W_k)$. This follows directly from 
the definition and the fact that the invocation time is after $\tau$. 

• $\succeq_{to}$ comparability: for any $W_i$ and $W_j$ in $S$, $(W_i \succeq_{to} W_j)$ or $(W_j \succeq_{to} W_i)$. This follows directly 
from the definition and the fact that the writes happen after $\tau$. □ Lemma 16 

Lemma 17 (Regularity). There is a finite time that follows either a non-concurrent write or a non-concurrent 
read, after which each read invocation $R$ returns a regular value. 

Proof In the following we prove that the value $v$ returned by $R$ is the value that was written by a write 
operation $W$, which is (a) the last write operation executed before $R$, or (b) a write operation concurrent 
with $R$. Following Lemma 16, there is a time $\tau$ such that all writes invoked after $\tau$ are totally ordered. 
Let $R$ be a read operation that happens after $\tau$. Let $W$ be the last write in that order that modified 
the register after $\tau$ and before $R$ started. $W$ either happened before $R$ or is concurrent with $R$. The reader 
$R$ first reads all the SWMR registers and stores their values in the vector $reg$. Let $k$ be the index of 
the SWMR register corresponding to $W$. Since $W$ is the last write on the register according to $\succeq_{to}$, it 
follows that $reg[k].epoch = max\_epoch(reg[1..m])$ and $reg[k].seq \geq reg[j].seq$ for every $j$ such that 
$reg[j].epoch = max\_epoch(reg[1..m])$ (lines 09-10, Figure 4), and $k$ is the minimal index with this property. It follows that $R$ 
returns $reg[k].v$, which is the value written by $W$. □ Lemma 17 
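
The selection rule described in this proof can be sketched as follows, assuming for illustration that epochs are integers (so that `max` plays the role of max_epoch) and that the vector read from the SWMR registers is already available as a list. The names `Cell`, `select_index` and `read_value` are hypothetical.

```python
from typing import NamedTuple, Sequence


class Cell(NamedTuple):
    epoch: int  # assumption: integer epochs stand in for the bounded labels
    seq: int
    v: str


def select_index(reg: Sequence[Cell]) -> int:
    """Index k with the greatest epoch, then the greatest seq, then the minimal index."""
    top_epoch = max(c.epoch for c in reg)
    candidates = [k for k, c in enumerate(reg) if c.epoch == top_epoch]
    top_seq = max(reg[k].seq for k in candidates)
    return min(k for k in candidates if reg[k].seq == top_seq)


def read_value(reg: Sequence[Cell]) -> str:
    """Value a read would return from the vector of register states."""
    return reg[select_index(reg)].v
```

For example, among cells with epochs (1, 2, 2, 2) and sequence numbers (4, 1, 3, 3), the sketch selects the third cell: it has the greatest epoch, the greatest sequence number within that epoch, and the smallest index among the ties.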

Lemma 18 (No new/old inversion). There is a finite time $\tau$ that follows either a non-concurrent write or 
a non-concurrent read, after which read invocations do not return new/old inverted values. 

Proof Following Lemma 16, there is a time $\tau$ such that all writes invoked after $\tau$ are totally ordered. Let 
$R_1$ and $R_2$ be two read operations that happen after $\tau$ and let $W_1$ and $W_2$ be two writes that also happen after $\tau$. Assume 
also that $R_1$ happens before $R_2$, $W_1$ happens before $W_2$ (and no other write happens after $W_1$ and before 
$W_2$), $R_1$ is concurrent with $W_1$ and $W_2$, and $R_2$ is concurrent with $W_2$. Assume a new/old inversion on 
$R_1$ and $R_2$. That is, $R_1$ returns the value written by $W_2$ and $R_2$ returns the value written by $W_1$. 

Let $m1$ be the index in $reg_{R_1}$ that stores the state of the register modified by $W_2$. Let $m2$ be the 
index in $reg_{R_2}$ that stores the state of the register modified by $W_1$. Since $W_1$ happens before $W_2$, then 
$reg_{R_1}[m1].epoch \succ reg_{R_2}[m2].epoch$, or $reg_{R_1}[m1].epoch = reg_{R_2}[m2].epoch$ and $reg_{R_1}[m1].seq > 
reg_{R_2}[m2].seq$. It follows that we have $reg_{R_2}[m1].epoch \succ reg_{R_2}[m2].epoch$, or $reg_{R_2}[m1].epoch = 
reg_{R_2}[m2].epoch$ and $reg_{R_2}[m1].seq > reg_{R_2}[m2].seq$. Hence, $R_2$ has to return the value stored at 
the index $m1$, which corresponds to the value written by $W_2$. This contradicts the new/old inversion 
assumption. □ Lemma 18 



Theorem 4. Let $t < n/8$ for the asynchronous version and $t < n/3$ for the link synchronous version. The 
algorithm described in Figure 4 implements a Byzantine-tolerant stabilizing MWMR atomic register. 


Proof The proof follows from Lemmas 14-18. □ Theorem 4 






